Abstract

We  show  that  the  asymptotic  variance  constant  in  a  stochastic  simulation  cannot  be 
estimated  consistently  from  batch  means  when  the  number  of  batches  is  held  fixed  as  the  run 
length increases.

1.  Introduction

In this paper we show that there does not exist a procedure to consistently estimate the
asymptotic variance constant in a stochastic simulation using batch means when the number of
batches is held fixed as the run length increases. Thus, if consistency is desired, then the
number of batches must increase as the run length increases.


To be precise, we must first specify what we mean by an estimation procedure. To be
interesting, an estimation procedure should apply to a large family of stochastic processes.
Hence, let $X \equiv \{X(t) : t \ge 0\}$ be a measurable mapping from a measure space
$(\Omega, \mathcal{F})$ into $D \equiv D[0, \infty)$, the space of right-continuous real-valued
functions on the interval $[0, \infty)$ with left limits, endowed with the usual Skorohod
topology and associated Borel $\sigma$-field; e.g., see Ethier and Kurtz [3]. Of course, we want
the underlying space $(\Omega, \mathcal{F})$ to be sufficiently rich; it suffices to let
$\Omega = D$ and $X(t)$ be the projection or coordinate map. We consider the set $\mathcal{P}$
of all probability measures $P$ on $(\Omega, \mathcal{F})$ such that there exist finite
deterministic constants $\mu \equiv \mu(P)$ and $\sigma \equiv \sigma(P)$ such that


$$ n^{1/2} \left[ \frac{1}{n} \int_0^{nt} X(s)\,ds - \mu t \right] \Rightarrow \sigma B(t) \quad \text{as } n \to \infty , \tag{1} $$


where $\Rightarrow$ denotes weak convergence in $D$ with respect to $P$ and
$B \equiv \{B(t) : t \ge 0\}$ is standard (zero-drift, unit diffusion coefficient) Brownian
motion. Our goal is to estimate $\sigma^2$, but we want our procedure to apply to all
$P \in \mathcal{P}$. In other words, the procedure should apply to all stochastic processes $X$
in $D$ satisfying the functional central limit theorem (FCLT) (1).

To apply the method of batch means, we specify the number $m$ of batches and the total run
length $T$. We then construct our estimates from the $m$ non-overlapping intervals of length
$T/m$; i.e., let the $i$th batch mean be

$$ \bar{X}_i(T) = \frac{m}{T} \int_{(i-1)T/m}^{iT/m} X(s)\,ds , \quad 1 \le i \le m . \tag{2} $$
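In practice the path is usually available only on a discrete grid, so the integral in (2) is replaced by an average of samples. The following is a minimal sketch under that assumption; the function name `batch_means` and the equally spaced sampling are illustrative choices, not part of the paper.

```python
import numpy as np

def batch_means(x, m):
    """Return the m non-overlapping batch means of the sampled path x.

    Assumes x holds samples of X on an equally spaced grid over [0, T];
    each batch mean is then a Riemann-sum stand-in for
    (m/T) * integral of X over the i-th interval of length T/m.
    """
    x = np.asarray(x, dtype=float)
    b = len(x) // m                # samples per batch (any remainder is dropped)
    if b == 0:
        raise ValueError("need at least m samples")
    return x[: b * m].reshape(m, b).mean(axis=1)

print(batch_means(np.arange(12.0), m=3))   # [1.5 5.5 9.5]
```

Dropping the leftover samples keeps all batches the same length, which matches the equal-length intervals in (2).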

We now want a procedure for combining the $m$ observations $\bar{X}_1(T), \ldots, \bar{X}_m(T)$
in such a way that $\sigma^2$ is consistently estimated as $T \to \infty$. This "combining
transformation" should not depend on the "fine structure" of the process $X$. In particular, it
should not depend on $\mu$ and $\sigma^2$. Thus, in this context we say that an estimation
procedure is a family of measurable mappings

$$ g_T : R^m \to R \quad \text{for } T > 0 , \tag{3} $$

such that the estimate of $\sigma^2$ is $g_T(x_1, \ldots, x_m)$ when the total run length is $T$
and $\bar{X}_i(T) = x_i$, $1 \le i \le m$. Note that $g_T$ can depend on $T$, but is independent
of $P$.

We say that an estimation procedure is $\mathcal{P}$-consistent if for each $P \in \mathcal{P}$

$$ g_T(\bar{X}_1(T), \ldots, \bar{X}_m(T)) \Rightarrow \sigma^2(P) \quad \text{as } T \to \infty . \tag{4} $$

Here $\Rightarrow$ denotes weak convergence with respect to $P$ in $R$, which is equivalent to
convergence in probability since $\sigma^2(P)$ is deterministic. Since we have a negative result,
we focus on this weak consistency. We would have strong consistency if the convergence were
w.p. 1 with respect to $P$.

Here  is  our  main  result.  It  applies  to  any  m. 

Theorem 1. There does not exist an estimation procedure that is $\mathcal{P}$-consistent.

In Section 2 we show what happens with the standard variance estimator. We see that we
do not get consistency for $\sigma^2$ for any fixed $m$, but we can get as close as we wish by
letting $m$ be suitably large. In Section 3 we prove Theorem 1.

Theorem 1 has applications to sequential stopping. It shows that the sufficient conditions
in Glynn and Whitt [5] for asymptotic validity are not satisfied when the number of batches is
held fixed as the run length grows.

For a fixed run length, our analysis shows that it is desirable to pick the number of batches
as  large  as  possible  without  seriously  violating  the  assumption  that  the  batches  are  independent 
and  identically  distributed  (i.i.d.)  with  a  normal  distribution.  However,  the  i.i.d.  normal 
assumption  typically  holds  only  as  an  approximation  and  then  only  when  there  are  large  batch 
sizes.  Statistical  tests  can  be  used  to  validate  the  assumption,  but  repeated  tests  of  significance 
on  the  same  data  are  fraught  with  peril,  both  theoretically  and  empirically.  Hence, 
Schmeiser  [9]  suggested  using  a  relatively  small  fixed  number  of  batches,  e.g.,  about  20.  This 
avoids  the  complications  above  and  gives  relatively  robust  confidence  intervals.  However,  we 
show  that  this  is  achieved  at  the  expense  of  consistency  for  the  variance  estimator. 
Asymptotically valid confidence intervals are obtained anyway, of course, by cancellation
methods, i.e., using the t distribution. For further discussion, see Schmeiser [9], Goldsman and
Meketon [6], Sargent, Kang and Goldsman [8], Glynn and Iglehart [4] and Damerdji [2].


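$$ s_T^2(\bar{X}_1(T), \ldots, \bar{X}_m(T)) = \frac{1}{m-1} \sum_{i=1}^{m} \left[ \frac{\int_{(i-1)T/m}^{iT/m} X(s)\,ds}{\sqrt{T/m}} - \frac{\int_0^T X(s)\,ds}{\sqrt{mT}} \right]^2 $$

$$ \Rightarrow \frac{\sigma^2 m}{m-1} \sum_{i=1}^{m} \left[ B(i/m) - B((i-1)/m) - \frac{B(1)}{m} \right]^2 = \frac{\sigma^2 \chi^2_{m-1}}{m-1} \quad \text{by (1)} . $$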
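The identification of the Brownian limit with a scaled chi-square random variable can be checked directly from the independent increments of $B$. A short derivation follows; the symbols $Z_i$ and $\bar{Z}$ are introduced here only for illustration, and the final equality is in distribution.

```latex
\begin{align*}
Z_i &\equiv \sqrt{m}\,\bigl[B(i/m) - B((i-1)/m)\bigr], \quad 1 \le i \le m,
  \qquad \text{i.i.d. } N(0,1), \\
\bar{Z} &\equiv \frac{1}{m}\sum_{i=1}^{m} Z_i = \frac{B(1)}{\sqrt{m}}, \\
\frac{\sigma^2 m}{m-1}\sum_{i=1}^{m}
  \Bigl[B(i/m) - B((i-1)/m) - \frac{B(1)}{m}\Bigr]^2
  &= \frac{\sigma^2}{m-1}\sum_{i=1}^{m} (Z_i - \bar{Z})^2
   \stackrel{d}{=} \frac{\sigma^2 \chi^2_{m-1}}{m-1}.
\end{align*}
```

The last step is the classical fact that the centered sum of squares of $m$ i.i.d. standard normal variables has a chi-square distribution with $m-1$ degrees of freedom.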


Note that $\sigma^2 \chi^2_{m-1}/(m-1)$ has mean $\sigma^2$ and variance $2\sigma^4/(m-1)$; see
p. 168 of Johnson and Kotz [7]. Moreover, as $m$ increases,

$$ \sqrt{m-1} \left[ \frac{\sigma^2 \chi^2_{m-1}}{m-1} - \sigma^2 \right] \Rightarrow N(0, 2\sigma^4) , $$

where $N(a, b)$ denotes a normally distributed random variable with mean $a$ and variance $b$.
Hence, we can get as close as we want if we choose $m$ suitably large. Moreover, we can
obtain consistency under extra regularity conditions if $m \to \infty$ and $T \to \infty$ so that
$T/m \to \infty$; see Goldsman and Meketon [6] and Damerdji [2]. In fact, Damerdji even proves
strong consistency for a class of stochastic processes.
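The fixed-$m$ behavior can be seen numerically. The sketch below is an illustration under stated assumptions rather than an experiment from the paper: it uses unit-spaced i.i.d. normal samples as a discrete stand-in for a process satisfying (1), and the names `batch_variance_estimate` and `estimator_spread` are hypothetical.

```python
import numpy as np

def batch_variance_estimate(x, m):
    """Standard batch-means estimate of sigma^2 from unit-spaced samples x.

    Computes (T/m) * (sample variance of the m batch means), where
    T = b*m is the number of samples actually used.
    """
    x = np.asarray(x, dtype=float)
    b = len(x) // m                        # samples per batch
    means = x[: b * m].reshape(m, b).mean(axis=1)
    return b * means.var(ddof=1)           # T/m equals b for unit spacing

def estimator_spread(m, n_steps, sigma=2.0, reps=1000, seed=0):
    """Monte Carlo mean and variance of the estimate over independent runs."""
    rng = np.random.default_rng(seed)
    est = np.array([
        batch_variance_estimate(rng.normal(0.0, sigma, n_steps), m)
        for _ in range(reps)
    ])
    return est.mean(), est.var()

if __name__ == "__main__":
    # Assumed illustration values: sigma = 2, m = 5 batches.
    for n_steps in (100, 1_000, 10_000):
        mean, var = estimator_spread(m=5, n_steps=n_steps)
        print(f"T={n_steps:6d}  mean={mean:.2f}  var={var:.2f}")
```

With $\sigma = 2$ and $m = 5$, the mean should stay near $\sigma^2 = 4$ and the variance near $2\sigma^4/(m-1) = 8$ at every run length, so increasing $T$ alone does not shrink the spread.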


3.  Proof  of  Theorem  1 


To establish the negative result, it suffices to restrict attention to probability measures $P$
such that $X$ coincides with $\sigma B$ where $B$ is standard Brownian motion. Then, for any $m$
and $T$,

$$ (\bar{X}_1(T), \ldots, \bar{X}_m(T)) = \left( \frac{m}{T} \sigma B(T/m), \ldots, \frac{m}{T} \sigma [B(T) - B(T(m-1)/m)] \right) , \tag{7} $$

so that the batch means are distributed exactly as $m$ i.i.d. normal random variables with mean
0 and variance $\sigma^2 m/T$. Without loss of generality, we can remove the $m/T$ factor by
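considering the transformed functions

$$ \tilde{g}_T(x_1, \ldots, x_m) = g_T\bigl( (m/T)^{1/2} x_1, \ldots, (m/T)^{1/2} x_m \bigr) . \tag{8} $$

Note that

$$ g_T(\bar{X}_1(T), \ldots, \bar{X}_m(T)) \stackrel{d}{=} \tilde{g}_T(\sigma N) \quad \text{for all } T , \tag{9} $$

where $\stackrel{d}{=}$ denotes equality in distribution, $\sigma N \equiv (\sigma N_1, \ldots, \sigma N_m)$
and $N$ is a fixed vector of i.i.d. standard (mean 0, variance 1) normal random variables.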
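The point of the rescaling is that the input handed to the procedure has the same law for every $T$. A minimal numerical sketch, assuming unit-spaced i.i.d. $N(0, \sigma^2)$ samples as a discrete stand-in for the Brownian case; the function name and parameter values are illustrative only.

```python
import numpy as np

def rescaled_batch_means(sigma, m, T, reps, rng):
    """Draw reps copies of (T/m)^{1/2} * (batch means).

    Assumes T unit-spaced i.i.d. N(0, sigma^2) samples per run, with T a
    multiple of m, so each batch holds b = T/m samples.
    """
    b = T // m
    x = rng.normal(0.0, sigma, size=(reps, m, b))
    return np.sqrt(b) * x.mean(axis=2)     # shape (reps, m)

rng = np.random.default_rng(1)
for T in (40, 400, 2000):
    z = rescaled_batch_means(sigma=1.5, m=4, T=T, reps=2000, rng=rng)
    # Each coordinate should have standard deviation close to sigma for every T.
    print(T, z.std(axis=0).round(2))
```

Because the rescaled vector behaves like $\sigma N$ regardless of $T$, a longer run gives the procedure no additional information about $\sigma$ when $m$ is fixed.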

Now consistency requires that

$$ \tilde{g}_T(\sigma N) \Rightarrow \sigma^2 \quad \text{as } T \to \infty \tag{10} $$

for all $\sigma > 0$, but this cannot happen for two or more different positive values of
$\sigma$, say $\sigma_1$ and $\sigma_2$. To see this, first note that the convergence in
probability for $\sigma_1$ in (10) implies that there is a sequence $\{T_n : n \ge 1\}$ of
deterministic positive numbers with $T_n \to \infty$ such that

$$ \tilde{g}_{T_n}(\sigma_1 N) \to \sigma_1^2 \quad \text{w.p.1 as } n \to \infty ; \tag{11} $$

see Theorem 4.2.3 of Chung [1]. By (10), $\tilde{g}_{T_n}(\sigma_2 N) \Rightarrow \sigma_2^2$ as
$n \to \infty$. Hence, there is a deterministic subsequence $\{T'_n : n \ge 1\}$ of
$\{T_n : n \ge 1\}$ such that

$$ \tilde{g}_{T'_n}(\sigma_i N) \to \sigma_i^2 \quad \text{w.p.1 as } n \to \infty \tag{12} $$

for both $i = 1$ and 2. Hence, for $i = 1$ and 2, $\tilde{g}_{T'_n}(x) \to \sigma_i^2$ for almost
all $x$ with respect to the law of $\sigma_i N$, which implies that
$\tilde{g}_{T'_n}(x) \to \sigma_i^2$ for almost all $x$ with respect to Lebesgue measure on
$R^m$, since $\sigma_i N$ has a positive density with respect to Lebesgue measure. However,
it is not possible to have $\tilde{g}_{T'_n}(x)$ simultaneously converge almost everywhere with
respect to Lebesgue measure to two different limits. (The set of convergence to one limit must be
contained in the null set of non-convergence for the other limit.)